Conversation

@naoyam
Collaborator

@naoyam naoyam commented Jan 9, 2026

No description provided.

@github-actions

github-actions bot commented Jan 9, 2026

Review updated until commit 5c9050d

Description

  • Enable TensorIndexer by default with "id_model" option in FusionExecutorCache

  • Simplify CUDA kernel generation by removing complex blockIdx.x indexing calculations

  • Update test to verify optimized tensor indexing behavior

  • Merge default enable options with user-provided options in execute method

Changes walkthrough

Relevant files

Enhancement
__init__.py: Enable TensorIndexer by default (+6/-1)

python/nvfuser_direct/__init__.py

  • Add "id_model" as default enable option in FusionExecutorCache
  • Merge default enable options with user-provided _enable_options
  • Ensure TensorIndexer is enabled by default for all executions

Tests
test_python_direct.py: Update test for TensorIndexer optimization (+4/-6)

tests/python/direct/test_python_direct.py

  • Update test_fusion_execution_cache to verify simplified CUDA kernel generation
  • Remove complex blockIdx.x indexing calculations from generated kernel
  • Verify TensorIndexer optimization produces cleaner tensor indexing

    PR Reviewer Guide

    Here are some key observations to aid the review process:

    🧪 PR contains tests
    ⚡ Recommended focus areas for review
    Breaking Change Impact

    The PR automatically adds "id_model" as a default enable option for all executions. This could be a breaking change that unexpectedly alters the behavior of existing users' code. It needs validation that "id_model" does not introduce unintended side effects or performance regressions for existing use cases.

    # Add "id_model" as a default enable option
    default_enable_options = ["id_model"]
    merged_enable_options = default_enable_options + _enable_options
    Kernel Generation Changes

    The CUDA kernel output has changed significantly - from using blockIdx.x calculations with i4 variable to conditional checks and different indexing patterns. This substantial change in code generation needs thorough validation to ensure correctness and that the new TensorIndexer implementation produces equivalent results.

    if ((((nvfuser_index_t)threadIdx.x) < 64)) {
      Array<float, 1, 1> T4;
      T4[0] = 0;
      T4[0]
         = T1[(((T1.alloc_stride[0LL] * i1) + (T1.alloc_stride[1LL] * i2)) + (T1.alloc_stride[2LL] * i3))];
      Array<float, 1, 1> T3;
      T3[0] = 0;
      T3[0]
         = T0[(((T0.alloc_stride[0LL] * i1) + (T0.alloc_stride[1LL] * i2)) + (T0.alloc_stride[2LL] * i3))];
      Array<float, 1, 1> T5;
      T5[0]
        = T3[0]
        + T4[0];
      T2[((nvfuser_index_t)threadIdx.x)]
         = T5[0];
    Performance Validation

    Since this PR enables TensorIndexer functionality, it should include performance benchmarks and data showing the benefits of this change. The PR should demonstrate that TensorIndexer provides measurable performance improvements over the previous implementation.

    return self.fec.execute(
        inputs,
        device=self._get_device_index(device),
        _enable_options=merged_enable_options,
        _disable_options=_disable_options,
    )

    Contributor

    @greptile-apps greptile-apps bot left a comment

    Greptile Overview

    Greptile Summary

    Enables the TensorIndexer (IdModel) by default for all Python direct tests by adding "id_model" as a default enable option in FusionDefinition.execute(). Simplifies the C++ option parsing logic in getIdModelEnabledOptions() by switching from an explicit opt-in model (checking for specific arguments like "consumer_index", "producer_index", etc.) to an opt-out model using "predicate_only" and "index_only" flags, removing redundant validation checks.

    Confidence Score: 4/5

    • Safe to merge - clean refactoring that enables IdModel by default for Python tests
    • The changes are straightforward and follow the pattern established in the previous PR (#5724) that enabled TensorIndexer for C++ tests. The C++ logic simplification removes complex conditional checks and validation code, making it easier to maintain. The Python change simply adds "id_model" to the default enable options list, which is non-breaking since users can still override via _enable_options/_disable_options parameters. No logical errors or edge cases detected.
    • No files require special attention

    Important Files Changed

    File Analysis

    | Filename | Score | Overview |
    |---|---|---|
    | csrc/id_model/utils.h | 4/5 | Simplifies IdModel option logic from explicit opt-in to opt-out with predicate_only/index_only flags |
    | python/nvfuser_direct/__init__.py | 4/5 | Adds "id_model" as default enable option for all FusionDefinition.execute() calls |

    Sequence Diagram

    sequenceDiagram
        participant Python as Python FusionDefinition.execute()
        participant FEC as FusionExecutorCache
        participant CPP as getIdModelEnabledOptions()
        
        Python->>Python: "Add 'id_model' to default_enable_options"
        Python->>FEC: "execute(merged_enable_options)"
        FEC->>CPP: "Check IdModel enable option"
        CPP->>CPP: "Check !predicate_only → Enable Index opts"
        CPP->>CPP: "Check !index_only → Enable Predicate opts"
        CPP->>CPP: "Check both unset → Enable Loop opt"
        CPP-->>FEC: "Return enabled options set"
        FEC-->>Python: "Execute with TensorIndexer enabled"
    

    @naoyam
    Collaborator Author

    naoyam commented Jan 9, 2026

    !test --diff

    @naoyam
    Collaborator Author

    naoyam commented Jan 9, 2026

    !test

    Contributor

    @greptile-apps greptile-apps bot left a comment

    Greptile Overview

    Greptile Summary

    Enables TensorIndexer by default for all python_direct tests by making IdModel always build and removing the disable option. The PR removes the build_id_model_ flag from IdModelOptions and the DisableOption::IdModel enum, unconditionally building IdModel during lowering. Python's FusionDefinition.execute() now adds "id_model" as a default enable option.

    Confidence Score: 2/5

    • Breaking change to IdModel option semantics requires test updates
    • The PR makes significant changes to how IdModel options are interpreted. The new getIdModelEnabledOptions() function changes from an opt-in model (explicit arguments like "all", "index") to a negation-based model ("predicate_only", "index_only"). This breaks existing usage in tests/python/direct/test_with_id_model_indexer.py which uses NVFUSER_ENABLE=id_model(all), but the "all" argument is no longer recognized. The old code would enable specific options based on explicit arguments, while the new code enables everything by default unless restricted. This semantic change warrants careful testing before merge.
    • csrc/id_model/utils.h - breaks backward compatibility with option arguments; tests/python/direct/test_with_id_model_indexer.py - needs update for new semantics

    Important Files Changed

    File Analysis

    | Filename | Score | Overview |
    |---|---|---|
    | csrc/device_lower/id_model_options.h | 4/5 | Removes build_id_model_ flag and related methods, simplifying the options structure |
    | csrc/device_lower/lower2device.cpp | 4/5 | Unconditionally builds IdModel by removing the conditional check for buildIdModel() |
    | csrc/id_model/utils.h | 4/5 | Simplifies option logic by checking EnableOption::IdModel first and using simpler argument checks |
    | python/nvfuser_direct/__init__.py | 4/5 | Adds "id_model" as default enable option for all fusion executions |

    Sequence Diagram

    sequenceDiagram
        participant Python as Python FusionDefinition
        participant FEC as FusionExecutorCache
        participant Lower as GpuLower
        participant IdModel as IdModel
        participant Options as IdModelOptions
        
        Python->>Python: "Add 'id_model' to enable options"
        Python->>FEC: "execute(inputs, enable=['id_model'])"
        FEC->>Lower: "analysis(fusion)"
        Lower->>IdModel: "Unconditionally construct IdModel"
        IdModel->>Options: "Initialize IdModelOptions"
        Options->>Options: "Check EnableOption::IdModel"
        alt IdModel not enabled
            Options->>Options: "Return empty options"
        else IdModel enabled
            alt predicate_only arg
                Options->>Options: "Enable predicates only"
            else index_only arg
                Options->>Options: "Enable indexing only"
            else No restrictive args
                Options->>Options: "Enable all options"
            end
        end
    

    Comment on lines 30 to 53
     inline std::unordered_set<IdModelEnableOption> getIdModelEnabledOptions() {
    +  if (!isOptionEnabled(EnableOption::IdModel)) {
    +    return {};
    +  }
    +
       std::unordered_set<IdModelEnableOption> opts;

    -  if (hasEnableOptionArgument(EnableOption::IdModel, "consumer_index") ||
    -      hasEnableOptionArgument(EnableOption::IdModel, "index") ||
    -      hasEnableOptionArgument(EnableOption::IdModel, "all")) {
    +  if (!hasEnableOptionArgument(EnableOption::IdModel, "predicate_only")) {
         opts.insert(IdModelEnableOption::ConsumerIndex);
    -  }
    -
    -  if (hasEnableOptionArgument(EnableOption::IdModel, "producer_index") ||
    -      hasEnableOptionArgument(EnableOption::IdModel, "index") ||
    -      hasEnableOptionArgument(EnableOption::IdModel, "all")) {
         opts.insert(IdModelEnableOption::ProducerIndex);
       }

    -  if (hasEnableOptionArgument(EnableOption::IdModel, "inline_predicate") ||
    -      hasEnableOptionArgument(EnableOption::IdModel, "predicate") ||
    -      hasEnableOptionArgument(EnableOption::IdModel, "all")) {
    +  if (!hasEnableOptionArgument(EnableOption::IdModel, "index_only")) {
         opts.insert(IdModelEnableOption::InlinePredicate);
    -  }
    -
    -  if (hasEnableOptionArgument(EnableOption::IdModel, "unswitch_predicate") ||
    -      hasEnableOptionArgument(EnableOption::IdModel, "predicate") ||
    -      hasEnableOptionArgument(EnableOption::IdModel, "all")) {
         opts.insert(IdModelEnableOption::UnswitchPredicate);
       }

    -  if (hasEnableOptionArgument(EnableOption::IdModel, "loop") ||
    -      hasEnableOptionArgument(EnableOption::IdModel, "all")) {
    +  if (!hasEnableOptionArgument(EnableOption::IdModel, "predicate_only") &&
    +      !hasEnableOptionArgument(EnableOption::IdModel, "index_only")) {
         opts.insert(IdModelEnableOption::Loop);
       }

    -  // Loop requires ConsumerIndex, ProducerIndex, InlinePredicate and
    -  // UnswitchPredicate
    -  if (opts.find(IdModelEnableOption::Loop) != opts.end()) {
    -    NVF_ERROR(
    -        opts.find(IdModelEnableOption::ConsumerIndex) != opts.end(),
    -        "ConsumerIndex required for Loop");
    -    NVF_ERROR(
    -        opts.find(IdModelEnableOption::ProducerIndex) != opts.end(),
    -        "ProducerIndex required for Loop");
    -    NVF_ERROR(
    -        opts.find(IdModelEnableOption::InlinePredicate) != opts.end(),
    -        "InlinePredicate required for Loop");
    -    NVF_ERROR(
    -        opts.find(IdModelEnableOption::UnswitchPredicate) != opts.end(),
    -        "UnswitchPredicate required for Loop");
    -  }
    -
       return opts;
     }
    Contributor

    The new implementation changes the semantics of EnableOption::IdModel arguments. The old code supported explicit arguments like "all", "index", "predicate", "consumer_index", "producer_index", etc. The new code only recognizes "predicate_only" and "index_only" as restrictive flags.

    This breaks existing usage like NVFUSER_ENABLE=id_model(all) found in tests/python/direct/test_with_id_model_indexer.py:183. Consider either:

    1. Updating the test to use NVFUSER_ENABLE=id_model (no arguments, which enables everything by default)
    2. Adding backward compatibility for the "all" argument
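
The difference between the two option models can be sketched in a few lines of Python. This is only an illustrative model of the C++ logic quoted in this comment, not nvFuser code: the function names `old_enabled_options` and `new_enabled_options` are hypothetical, features are represented as plain strings, and it assumes that arguments the new code does not look for (such as "all") are silently ignored, which the excerpt does not confirm.

```python
from typing import FrozenSet, Iterable

ALL_FEATURES: FrozenSet[str] = frozenset(
    {"ConsumerIndex", "ProducerIndex", "InlinePredicate", "UnswitchPredicate", "Loop"}
)


def old_enabled_options(args: Iterable[str]) -> FrozenSet[str]:
    """Opt-in model: a feature is on only if one of its arguments is present."""
    given = set(args)
    opts = set()
    if given & {"consumer_index", "index", "all"}:
        opts.add("ConsumerIndex")
    if given & {"producer_index", "index", "all"}:
        opts.add("ProducerIndex")
    if given & {"inline_predicate", "predicate", "all"}:
        opts.add("InlinePredicate")
    if given & {"unswitch_predicate", "predicate", "all"}:
        opts.add("UnswitchPredicate")
    if given & {"loop", "all"}:
        opts.add("Loop")
    # Mirrors the removed NVF_ERROR checks: Loop needs the other four features.
    if "Loop" in opts and opts != ALL_FEATURES:
        raise ValueError("Loop requires all index and predicate features")
    return frozenset(opts)


def new_enabled_options(id_model_enabled: bool, args: Iterable[str]) -> FrozenSet[str]:
    """Opt-out model: everything is on unless a restrictive flag is present."""
    if not id_model_enabled:
        return frozenset()
    given = set(args)
    predicate_only = "predicate_only" in given
    index_only = "index_only" in given
    opts = set()
    if not predicate_only:
        opts |= {"ConsumerIndex", "ProducerIndex"}
    if not index_only:
        opts |= {"InlinePredicate", "UnswitchPredicate"}
    if not predicate_only and not index_only:
        opts.add("Loop")
    return frozenset(opts)
```

In this model, `new_enabled_options(True, ["all"])` returns the same set as `new_enabled_options(True, [])`, because "all" is neither of the two restrictive flags. Whether `id_model(all)` really breaks therefore depends on how unrecognized arguments are handled elsewhere, which this thread does not show.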

    @naoyam
    Collaborator Author

    naoyam commented Jan 9, 2026

    !test

    Contributor

    @greptile-apps greptile-apps bot left a comment

    Greptile Overview

    Greptile Summary

    This PR enables TensorIndexer with all Python direct tests by making IdModel always-on and simplifying its configuration.

    Key Changes:

    1. Removes DisableOption::IdModel - IdModel is now always built during GPU lowering and cannot be disabled
    2. Simplifies IdModelOptions - Removes build_id_model_ flag and related methods, streamlines consistency checking
    3. Updates enable option parsing - Replaces granular options (consumer_index, producer_index, etc.) with simpler predicate_only/index_only flags
    4. Migrates from ComputeAtMap to IdModel APIs - Changes predicate_compute.cpp to use idModel().idGraph() instead of caMap() for getting concrete IDs and checking mappings
    5. Enables id_model by default in Python API - Adds "id_model" to default enable options for FusionDefinition.execute()
    6. Updates test expectations - Adjusts expected variable names and generated CUDA code to match new indexing behavior

    The changes are part of a larger effort to transition from ComputeAtMap to IdModel/TensorIndexer as the primary indexing mechanism.

    Confidence Score: 5/5

    • Safe to merge - well-tested refactoring that enables TensorIndexer consistently across Python tests
    • This PR removes the ability to disable IdModel and simplifies its configuration. All changes are internally consistent: DisableOption::IdModel is removed from enum, options map, and all conditional checks. The migration from caMap to idModel APIs in predicate_compute.cpp follows established patterns seen elsewhere in the codebase. Test updates reflect expected behavior changes from the new indexing approach. The Python API change safely adds id_model to default options (duplicates are handled gracefully by the options map).
    • No files require special attention

    Important Files Changed

    File Analysis

    | Filename | Score | Overview |
    |---|---|---|
    | csrc/device_lower/id_model_options.h | 5/5 | Removes build_id_model option and related code, simplifying IdModelOptions to always build IdModel |
    | csrc/device_lower/lower2device.cpp | 5/5 | Removes conditional IdModel building, now always builds IdModel during lowering |
    | csrc/id_model/utils.h | 5/5 | Simplifies getIdModelEnabledOptions to use predicate_only/index_only flags instead of granular options |
    | csrc/options.cpp | 5/5 | Removes id_model from DisableOptions map |
    | csrc/options.h | 5/5 | Removes IdModel enum value from DisableOption |
    | csrc/predicate_compute.cpp | 5/5 | Replaces caMap with idModel for getting concrete IDs and checking ID mappings |
    | python/nvfuser_direct/__init__.py | 5/5 | Adds id_model to default enable options for Python API execution |
    | tests/cpp/test_indexing.cpp | 5/5 | Updates expected index variable names in test (i98-i100 to i114-i116) |
    | tests/python/direct/test_python_direct.py | 5/5 | Updates expected CUDA kernel code to match new simplified indexing patterns |

    Sequence Diagram

    sequenceDiagram
        participant User
        participant PythonAPI as Python FusionDefinition
        participant Lower as GpuLower
        participant IdModelOpts as IdModelOptions
        participant IdModel
        participant PredicateCompute
        participant TensorIndexer
    
        User->>PythonAPI: execute(inputs)
        PythonAPI->>PythonAPI: Add "id_model" to enable_options
        PythonAPI->>Lower: FusionExecutorCache.execute()
        
        Lower->>IdModelOpts: Construct with options
        Note over IdModelOpts: No longer checks DisableOption::IdModel<br/>Always initializes for TensorIndexer
        
        Lower->>IdModel: Build IdModel (unconditional)
        Note over IdModel: Previously conditional on buildIdModel()
        IdModel-->>Lower: IdModel graphs built
        
        Lower->>PredicateCompute: Generate predicates
        PredicateCompute->>IdModel: getConcreteMappedId(id)
        Note over PredicateCompute: Uses idModel().idGraph().toGroup()->front()<br/>Instead of caMap().getConcreteMappedID()
        IdModel-->>PredicateCompute: concrete ID
        
        PredicateCompute->>IdModel: Check ID mapping
        Note over PredicateCompute: Uses idGraph().disjointValSets().strictAreMapped()<br/>Instead of caMap().areMapped()
        IdModel-->>PredicateCompute: mapping result
        
        PredicateCompute-->>Lower: Predicates with TensorIndexer
        Lower-->>User: Compiled kernel
    

    @greptile-apps
    Contributor

    greptile-apps bot commented Jan 13, 2026

    Greptile Overview

    Greptile Summary

    This PR enables the TensorIndexer (IdModel) by default for all Python direct API tests by modifying FusionDefinition.execute() to automatically prepend "id_model" to the enable options list before passing it to FusionExecutorCache.

    Implementation approach:

    • In python/nvfuser_direct/__init__.py, lines 371-372: Creates default_enable_options = ["id_model"] and merges with user options via list concatenation
    • The prepending strategy ensures id_model is always enabled, even if users pass explicit _enable_options
    • Since C++ uses a map to store options (options_[option_type] = option), duplicate "id_model" entries are harmless (later set overwrites earlier)
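
The overwrite behavior described in the last bullet can be modeled with a small dictionary-backed store. This is a toy sketch, not the nvFuser options class: `OptionsModel`, `apply_enable_options` and the option name "some_other_option" are hypothetical, and the only behavior taken from the thread is that options are stored by key, so a later `set` replaces an earlier one.

```python
from typing import Dict, Iterable, List, Optional


class OptionsModel:
    """Toy map-backed option store (hypothetical stand-in, not nvFuser's class)."""

    def __init__(self) -> None:
        self._options: Dict[str, List[str]] = {}

    def set(self, name: str, args: Optional[Iterable[str]] = None) -> None:
        # Keyed storage: setting the same name again replaces the old entry.
        self._options[name] = list(args or [])

    def has(self, name: str) -> bool:
        return name in self._options

    def args(self, name: str) -> List[str]:
        return self._options.get(name, [])

    def __len__(self) -> int:
        return len(self._options)


def apply_enable_options(names: Iterable[str]) -> OptionsModel:
    """Apply a list of option names in order, as a Python-side list would be."""
    opts = OptionsModel()
    for name in names:
        opts.set(name)
    return opts


# A duplicated "id_model" collapses into a single entry.
merged = apply_enable_options(["id_model", "some_other_option", "id_model"])
```

One consequence of keyed overwriting in this model is that a later `set("id_model")` with no arguments would also discard arguments attached by an earlier call, so "harmless" holds only as long as both entries carry the same arguments.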

    Test updates:

    • Updated expected CUDA kernel output in test_fusion_execution_cache() to reflect simpler code generation with id_model
    • Key differences: removed blockIdx-based index calculations, simplified array indexing to use threadIdx directly
    • The new indexer generates T1[(((T1.alloc_stride[0LL] * i1)... instead of T1[((((T1.alloc_stride[0LL] * i1)... + (4 * T0.alloc_stride[0LL]) * blockIdx.x))
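
The indexing expression quoted above is a sum of stride-times-index terms, one per allocation dimension. A minimal Python sketch of that arithmetic follows; the helper name `linear_offset` and the example strides are made up for illustration, and how the kernel derives i1, i2 and i3 is not shown in this thread.

```python
from typing import Sequence


def linear_offset(strides: Sequence[int], indices: Sequence[int]) -> int:
    """Return sum(stride[d] * index[d]), the element offset into a strided buffer."""
    if len(strides) != len(indices):
        raise ValueError("strides and indices must have the same rank")
    return sum(s * i for s, i in zip(strides, indices))


# Hypothetical contiguous 2x3x4 tensor: strides are (12, 4, 1).
offset = linear_offset((12, 4, 1), (1, 2, 3))
```

For the contiguous example, element (1, 2, 3) sits at offset 1*12 + 2*4 + 3*1 = 23. The same function works for non-contiguous layouts, since the strides are read from the tensor rather than derived from its shape.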

    Codebase integration:

    • This change affects all users of FusionDefinition.execute() in nvfuser_direct (Python direct API)
    • IdModel is defined as an EnableOption in csrc/options.h:106
    • No corresponding DisableOption exists, so users cannot opt out once enabled

    Confidence Score: 5/5

    • Safe to merge - cleanly enables id_model by default with correct test updates
    • The implementation is straightforward and correct: prepends id_model to enable options, handles duplicates safely via map-based storage in C++, and updates test expectations to match new code generation. No breaking changes or runtime issues expected.
    • No files require special attention

    Important Files Changed

    File Analysis

    | Filename | Score | Overview |
    |---|---|---|
    | python/nvfuser_direct/__init__.py | 4/5 | Adds "id_model" to default enable options before merging with user-provided options. Prepends rather than appends to ensure id_model is always included. |
    | tests/python/direct/test_python_direct.py | 5/5 | Updates expected CUDA kernel output to match simpler indexing patterns generated by id_model indexer (removed blockIdx calculations, simplified array indexing). |

    Sequence Diagram

    sequenceDiagram
        participant User
        participant FusionDefinition
        participant FusionExecutorCache
        participant EnableOptionsGuard
        
        User->>FusionDefinition: execute(inputs, _enable_options=[])
        FusionDefinition->>FusionDefinition: Prepend "id_model" to _enable_options
        Note over FusionDefinition: merged = ["id_model"] + _enable_options
        FusionDefinition->>FusionExecutorCache: execute(inputs, _enable_options=merged)
        FusionExecutorCache->>EnableOptionsGuard: set(IdModel) for each option
        Note over EnableOptionsGuard: Options stored in map (duplicates overwritten)
        EnableOptionsGuard->>FusionExecutorCache: Options configured
        FusionExecutorCache->>FusionExecutorCache: Compile & execute with IdModel enabled
        FusionExecutorCache-->>User: Return outputs
    

    Contributor

    @greptile-apps greptile-apps bot left a comment

    1 file reviewed, 1 comment

    Comment on lines +370 to +372
    # Add "id_model" as a default enable option
    default_enable_options = ["id_model"]
    merged_enable_options = default_enable_options + _enable_options
    Contributor

    Critical Logic Issue: Adding "id_model" to enable options without arguments has no effect.

    The IdModel system requires explicit sub-options to be passed as arguments (e.g., "all", "index", "consumer_index", "producer_index", etc.). These are checked via hasEnableOptionArgument(EnableOption::IdModel, "arg") in csrc/id_model/utils.h:30-80.

    When you add "id_model" here without arguments, the Python bindings call EnableOptionsGuard::getCurOptions().set(opt.value()) with an empty vector (see python/python_direct/runtime.cpp:284). This means no IdModel features are actually enabled.

    Evidence from codebase:

    1. C++ tests use: EnableOption::IdModel, {"all"} (tests/cpp/utils.cpp)
    2. Python tests that need IdModel use: set_env(NVFUSER_ENABLE="id_model(all)") (tests/python/direct/test_with_id_model_indexer.py:183)

    Proposed solution:
    The Python _enable_options API doesn't support passing arguments. You need to either:

    1. Modify the Python bindings to support syntax like ["id_model:all"] or similar
    2. Use environment variable NVFUSER_ENABLE="id_model(all)" in test setup
    3. Add a new Python API parameter specifically for IdModel options

    Without this fix, TensorIndexer will NOT be enabled with python_direct tests as the PR title claims.

    @naoyam naoyam changed the base branch from main to id_model_option_cleanup January 13, 2026 07:01
    @naoyam
    Collaborator Author

    naoyam commented Jan 13, 2026

    !test

    Contributor

    @greptile-apps greptile-apps bot left a comment

    2 files reviewed, 2 comments

    Comment on lines +371 to +372
    default_enable_options = ["id_model"]
    merged_enable_options = default_enable_options + _enable_options
    Contributor

    [P2] Potential duplicate options if user passes "id_model" in _enable_options. Consider checking if "id_model" is already in _enable_options before adding it, or use a set to avoid duplicates. While this doesn't cause functional issues (the second set() call just overwrites), it's cleaner to avoid duplicates.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
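
One way to act on the duplicate-avoidance suggestion is an order-preserving merge. The sketch below is an assumption about how that could look, not code from the PR: `merge_enable_options` is a hypothetical helper and "some_other_option" is a placeholder option name.

```python
from typing import Iterable, List


def merge_enable_options(defaults: Iterable[str], user_options: Iterable[str]) -> List[str]:
    """Concatenate defaults and user options, keeping the first occurrence of each name."""
    # dict preserves insertion order, so this de-duplicates without reordering.
    return list(dict.fromkeys([*defaults, *user_options]))


merged_enable_options = merge_enable_options(
    ["id_model"], ["id_model", "some_other_option"]
)
```

Compared with a plain `set`, `dict.fromkeys` keeps the defaults first and the user options in the order they were given, which matters if later options are meant to override earlier ones.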

    Comment on lines +370 to +372
    # Add "id_model" as a default enable option
    default_enable_options = ["id_model"]
    merged_enable_options = default_enable_options + _enable_options
    Contributor

    [P2] No opt-out mechanism provided. Users cannot disable id_model when using execute() even if they want to. Consider either: (1) checking if "id_model" is in _disable_options and skipping the default, or (2) providing a parameter to control default options. This limits flexibility for users who may need to test without id_model or work around potential id_model bugs.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
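
Suggestion (1) could be sketched as follows. This is a hypothetical helper, not the PR's code: `resolve_options` is an invented name, and it assumes "id_model" may appear in `_disable_options` purely as a Python-side signal. Because another review in this thread notes that no corresponding DisableOption exists, the sketch does not forward such names to the backend.

```python
from typing import Iterable, List, Tuple


def resolve_options(
    defaults: Iterable[str],
    enable_options: Iterable[str],
    disable_options: Iterable[str],
) -> Tuple[List[str], List[str]]:
    """Drop any default the caller asked to disable; return (enable, disable) lists."""
    default_list = list(defaults)
    disable_list = list(disable_options)
    disabled = set(disable_list)
    # Defaults are applied only when the caller has not opted out of them.
    enable = [name for name in default_list if name not in disabled]
    enable += list(enable_options)
    # Names used only to suppress a default are consumed here, not forwarded.
    forwarded_disable = [name for name in disable_list if name not in set(default_list)]
    return enable, forwarded_disable


enable, disable = resolve_options(["id_model"], [], ["id_model"])
```

With this shape, a caller who passes nothing still gets "id_model" by default, while a caller who lists it in the disable options gets the previous behavior. An option that is both explicitly enabled and disabled stays enabled in this sketch, which is a design choice rather than something the thread specifies.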

    Contributor

    @greptile-apps greptile-apps bot left a comment

    1 file reviewed, 1 comment

    Comment on lines +370 to +372
    # Add "id_model" as a default enable option
    default_enable_options = ["id_model"]
    merged_enable_options = default_enable_options + _enable_options
    Contributor

    Adding "id_model" as a default enable option means users cannot opt-out of this behavior through the Python API. Consider whether this is intentional or if there should be a mechanism to allow users to disable id_model for testing/debugging purposes.

    Potential implications:

    • Users performing A/B testing between old and new indexing cannot easily do so
    • Debugging issues specific to id_model becomes harder without a way to disable it
    • This is a behavioral change that affects all python_direct users

    If this is intentional (to force id_model adoption), consider documenting this breaking change clearly. If not, consider checking if "id_model" is already in _enable_options or providing an opt-out mechanism.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

    Contributor

    @greptile-apps greptile-apps bot left a comment

    No files reviewed, no comments

    @naoyam naoyam requested a review from jjsjann123 January 13, 2026 16:30
    Base automatically changed from id_model_option_cleanup to main January 14, 2026 00:32
    @naoyam naoyam merged commit cbc9450 into main Jan 14, 2026
    64 checks passed
    @naoyam naoyam deleted the tensorindexer_pytest branch January 14, 2026 00:32

    Labels

    None yet

    Projects

    None yet


    3 participants